Amendments to the Specification:

Please replace the original specification with the attached substitute
specification, which is attached along with a marked-up copy.

Amendments to the Abstract: 

Please replace the original abstract with the following abstract. 
ABSTRACT 

A text mining method is provided for separating high frequency
information and low frequency information and applying an ideal analysis
method to each kind of information. Negative expressions and modality
expressions are extracted from the low frequency information to assist in
extracting valuable knowledge for risk management. Extracting valuable
information for risk management or from the actual customer voice in a
call center text database requires essential valuable knowledge to be
extracted from vast quantities of ordinary information. This method
functions to hold in a folder a document found by a keyword search, and
to store the remaining text into a low frequency information folder,
after having stored the high frequency information found by keyword
search. A function is also provided for extracting modality expressions
that express negative expressions and modalities as a unit, so as to
extract valuable knowledge for risk management from low frequency
information.
TITLE OF THE INVENTION

INFORMATION PROCESSOR AND PROGRAM FOR IMPLEMENTING 
INFORMATION PROCESSOR 

BACKGROUND OF THE INVENTION
Field of the Invention

The present invention relates to a text mining method
for extracting knowledge from text in natural language and
is mainly used for analysis in a call center text database.

Description of the Related Art

Text classification systems using keywords specified
by a user assist in classifying text by detecting and
displaying keywords as viewed from their lack of use (or
keywords not used in a category) based on the frequency that
the keyword appears in the text (See for example, patent
document 1).

The unit for extracting valuable knowledge for risk
management focuses on expressions such as "rude" or
"disappointment". In this method for extracting
negative expressions, keywords having negative meanings,
such as "lost order" or "complaint", are preset
according to their domain, a search is made, and, if a hit
occurs, an alert is issued. There are also text classification
systems possessing a unit allowing the user to rewrite a
keyword dictionary for the text category (See for example,
patent document 2).

[Patent document 1] JP-A No. 101226/2001 

[Patent document 2] JP-A No. 184351/2001

Text classification technology of the related art is
suitable for extracting and categorizing high-frequency
knowledge. However, extracting valuable information for
risk management and the actual voice of the customer from
the call center text database by extracting low frequency
knowledge is extremely important. In other words, it is
important to efficiently, and without omissions, extract the
essential valuable knowledge from among a vast quantity of
ordinary information. An object of the present invention
is to create FAQ (frequently asked questions) based on a high
frequency of inquiries and to extract valuable information
for risk management from a low frequency (low number) of
inquiries. Analyzing text (or text mining) for risk
management uses the technique of extracting negative
expressions. In the method for extracting negative
expressions, keywords such as "rude" or "disappointment"
are preset and a search is made. However, this method has the
problem that setting the keywords in advance requires much
time and effort, covering all items is impossible, and many
omissions occur.
SUMMARY OF THE INVENTION

To resolve the above-mentioned problems of the related
art, the text mining system of the present invention employs
a method for extracting low frequency information having a
function for extracting and storing high frequency
information in a folder, and then gathering the remainder
of the text and storing it in a low frequency information
folder. The system of the present invention further has a
unit to eliminate noise and omissions in the extraction of
negative expressions from data in the low frequency
information folder by extracting candidate negative words
from the target text by utilizing a dictionary storing
characters having negative meanings, such as "lose" or
"negative", and, after registering words determined to
be negative words in the negative word dictionary, using
this negative word dictionary to extract the negative
expressions.

The present invention is capable of sorting
information in a call center text database (hereafter,
reply log) into high frequency information and low frequency
information, rendering the effect that text mining methods
can be applied to each type of information. Sorting the high
frequency information into topics assists in creating FAQ.
Information valuable for risk management can be extracted
by viewing low frequency information in terms of negative
expressions and modality expressions.

The negative expression extraction method of the
present invention has the effect of preventing omissions
during extraction by using characters as clues to extract
candidate negative words contained in the target text for
analysis (mining). The task of judging whether the
candidate negative words that were extracted are negative
words must be performed by human effort. However, words
determined to be negative words are accumulated in the
negative word dictionary, and words determined not to be
negative words are accumulated in the stop word dictionary for
extracting negative words, so the invention renders the
further effect that the number of candidate negative words
is gradually narrowed down through the process of
repetition.



BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 is a block diagram of the embodiment of the text
mining system of the present invention;

FIG. 2 is a drawing showing the data structure of the
call center text database;

FIG. 3 is a drawing showing the data structure of an
association thesaurus storage section;

FIG. 4 is a drawing showing the data structure of a
term vector storage section;

FIG. 5 is a drawing showing the data structure of a
thesaurus overview storage section;

FIG. 6 is a drawing showing the data structure of a
display interface for text classification;

FIG. 7 is a flow chart showing the procedure for
generating data for thesaurus browsing;

FIG. 8 is a flow chart showing the procedure for
thesaurus browsing;

FIG. 9 is a flow chart showing the text classification
procedure;

FIG. 10 is a drawing showing the data structure of a
text folder;

FIG. 11 is a drawing showing an example of a negative
word identification screen;

FIG. 12 is a drawing showing the data structure of a
negative character dictionary;

FIG. 13 is a drawing showing the data structure of a
negative word dictionary;

FIG. 14 is a drawing showing the data structure of a
stop word dictionary for extracting negative words;

FIG. 15 is a drawing showing the data structure of a
modality expression dictionary;

FIG. 16 is a drawing showing the data structure of a
stop word dictionary for extracting modality expressions;

FIG. 17 is a flow chart showing the procedure for
extracting candidate negative words;

FIG. 18 is a flow chart showing the procedure for
generating a negative word dictionary;

FIG. 19 is a flow chart showing the procedure for
extracting modality expressions;

FIG. 20 is a flow chart showing the procedure for
generating a modality expression dictionary; and

FIG. 21 is a flow chart showing the procedure for
extracting negative expressions and modality expressions.



DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 

The embodiments of the present invention are described
next. The embodiment is a text mining system for call center
text databases. The embodiments are described in detail while
referring to the accompanying drawings.
(System structure) 

FIG. 1 is a block diagram of the first embodiment of
the text mining system of the present invention. This system
comprises a CPU 101, an input device 102, a display 103, a
call center text database 104, a data storage section for
thesaurus browsing 105, a text folder 106, a data storage
section for extracting low frequency knowledge 107, and a
memory 108. The data storage section for thesaurus browsing
105 comprises a storage section for an association thesaurus
1051, a storage section for term vectors 1052, and a storage
section for a thesaurus overview 1053. The data storage
section for extracting low frequency knowledge 107
comprises a negative character dictionary 1071 for
implementing extraction of negative expressions, a negative
word dictionary 1072, a stop word dictionary 1073 for
extracting negative words, a modality expression dictionary
1074 for implementing extraction of modality expressions,
and a stop word dictionary 1075 for extracting modality
expressions. The memory 108 comprises a thesaurus browsing
data generator unit 1081, a thesaurus browser processing
unit 1082, a text retrieval unit 1083, a candidate negative
word extraction unit 1084, a negative word dictionary
generator unit 1085, a modality expression extraction unit
1086, and a modality expression dictionary generator unit
1087.

(Call Center Text Database) 

FIG. 2 is a drawing showing the data structure of the
call center text database 104. A conversation (inquiry) ID
1041, a transcript of conversation 1042, a retrieval flag
1043 showing that keyword retrieval is complete, and a
classifying flag 1044 showing that sorting into the
classification folder is complete are recorded in each
record of the call center text database 104.

(Thesaurus Browsing Function)

The system of this invention contains a thesaurus
browsing function to assist in extracting documents
containing valuable information. Here, a thesaurus is a
network expression showing distinctive (characteristic)
words within a document collection and their relation. The
thesaurus browsing function of this system comprises a
function to automatically create a thesaurus from a document
collection, and a function to show an overview and a detailed
view of the thesaurus (overall display and zoom display). The
automatic creation of the thesaurus and the thesaurus
display are implemented by the thesaurus browsing method
disclosed, for example, in JP-A No. 227917/2000. The overall
concept of the data and processing procedures for
implementing the thesaurus browsing function of this system
is described next. The data for implementing the thesaurus
browsing function is first described. The thesaurus
browsing data storage section 105 comprises an association
thesaurus 1051, a term vector storage section 1052, and a
thesaurus overview storage section 1053.

The association thesaurus created from document data
in the transcript of conversation 1042 of the call center text
database 104 is stored in the association thesaurus 1051.
The association thesaurus shows the relation between one
word and another word. In this embodiment, the association
level expresses how easily two words co-occur. The
association level is based on the frequency at
which each word occurs and the co-occurrence frequency
(frequency at which the two words appear simultaneously
within a certain range in the text). FIG. 3 shows the data
structure of the association thesaurus 1051. The
association thesaurus 1051 comprises a record ID 10511, a
term X 10512, a term Y 10513, and an association level 10514.
Related terms are stored in the term X 10512 and the term
Y 10513, and their association level is stored in the
association level 10514.
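The record layout of FIG. 3 can be illustrated with a minimal sketch. The specification states only that the association level is based on the individual word frequencies and the co-occurrence frequency; the exact formula is not given. The sketch below therefore assumes, purely for illustration, that two terms co-occur when they appear in the same transcript and that the association level is the Dice coefficient. The function name `build_association_thesaurus` and the sample transcripts are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_association_thesaurus(transcripts):
    """Build records of (record_id, term_x, term_y, association_level).

    Assumption: two terms co-occur when they appear in the same transcript,
    and the association level is the Dice coefficient of the two terms.
    """
    term_freq = Counter()  # number of transcripts containing each term
    pair_freq = Counter()  # number of transcripts containing both terms
    for terms in transcripts:
        unique_terms = sorted(set(terms))
        term_freq.update(unique_terms)
        pair_freq.update(combinations(unique_terms, 2))

    records = []
    for record_id, ((x, y), co) in enumerate(sorted(pair_freq.items()), start=1):
        level = 2 * co / (term_freq[x] + term_freq[y])
        records.append(
            {"record_id": record_id, "term_x": x, "term_y": y,
             "association_level": level}
        )
    return records


# Hypothetical, already tokenized transcripts of conversation.
transcripts = [
    ["printer", "error"],
    ["printer", "error", "refund"],
    ["refund"],
]
thesaurus = build_association_thesaurus(transcripts)
for record in thesaurus:
    print(record)
```

Any other measure that combines the two frequencies could be substituted for the Dice coefficient without changing the record layout.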

Term vectors extracted from document data stored in
the transcript of conversation 1042 of the call center text
database 104 are stored in the term vector storage section 1052.
Here, term vectors are the numerical weights of terms in a
document and can be extracted by utilizing the tf-idf method
(Term Frequency Inverse Document Frequency) described in
"Salton, G., et al.: A Vector Space Model for Automatic
Indexing, Communications of the ACM, Vol. 18, No. 11 (1975)".
This tf-idf method is well known as a text indexing
method. In this method, a value found by multiplying the
frequency that the subject term appears in a document (tf)
by its inverse document frequency (idf) is set
as the weight of the term in the target document, and terms
with a high weight (in other words, key terms) are extracted
and set as the term vectors. FIG. 4 shows the data structure
of the term vector storage section 1052. The term vector
storage section 1052 comprises a record ID 10521, a
conversation ID 10522 and a key term list 10523. An ID for
the text log (response log) stored in the call center text
database 104 is stored in the record ID 10521. A list of
high-weighted (important) terms appearing in the transcript of
conversation of the applicable text log is stored in the
key term list 10523.
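The weighting described above can be sketched as follows. This is an illustrative sketch only: it assumes the common idf form log(N / df), which the specification does not state, and the function name `key_terms` and the sample logs are hypothetical.

```python
import math
from collections import Counter


def key_terms(documents, top_n=3):
    """Return the top_n highest-weighted terms of each document (tf * idf).

    Assumption: idf is log(N / df), where N is the number of documents and
    df is the number of documents containing the term.
    """
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in documents for term in set(doc))
    result = []
    for doc in documents:
        tf = Counter(doc)
        weights = {t: tf[t] * math.log(n_docs / df[t]) for t in tf}
        # Highest weight first; ties broken alphabetically for stable output.
        ranked = sorted(weights, key=lambda t: (-weights[t], t))
        result.append(ranked[:top_n])
    return result


# Hypothetical, already tokenized transcripts of conversation.
logs = [
    ["printer", "error", "printer", "refund"],
    ["printer", "manual"],
    ["delivery", "late", "printer"],
]
term_vectors = key_terms(logs, top_n=2)
print(term_vectors)
```

In this sample, "printer" appears in every log, so its idf is zero and it is ranked below the terms that distinguish each log.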
An overview of the association thesaurus in the
association thesaurus storage section 1051 is stored in the
thesaurus overview storage section 1053. Here, in the
thesaurus overview, representative terms are extracted as the
most characteristic terms within the document collection,
and representative terms with a strong association are
summarized into a term cluster. FIG. 5 shows the data
structure of the thesaurus overview storage section 1053.
The thesaurus overview storage section 1053 comprises a term
group number 10531 and a term list 10532. A list of terms
belonging to the term cluster is stored in the term list
10532.

The thesaurus browsing data has now been described.
The procedures for generating thesaurus browsing data
and thesaurus browsing processing for implementing the
thesaurus browsing functions are described next using the
flow charts in FIG. 7 and FIG. 8.

(Procedures for Generating Thesaurus Browsing Data) 

Thesaurus browsing data is first of all loaded to
prepare the analysis environment. The process for
generating thesaurus browsing data, as shown in FIG. 7,
comprises the steps of generating an association thesaurus
(step 701) showing the terms and the term association level
from each document; extracting term vectors from each document
(step 702); and generating a thesaurus overview (step 703).
The thesaurus overview extracts the most characteristic
terms within the document collection as representative terms,
and summarizes representative terms with a strong
association into a term cluster. The representative term
process sets the key terms that make up the term vectors and
are important in each document as the representative terms.
The term cluster generation process summarizes terms with a
high association (association level) into one cluster based on
the association level between terms stored in the association
thesaurus.

(Thesaurus Browsing Processing Procedure) 

In the thesaurus browsing process shown in FIG. 8,
the thesaurus overview stored in the thesaurus overview
storage section 1053 is, for example, displayed to the user
as shown in the thesaurus overview display 602 in FIG. 6 (step
801). The thesaurus overview display 602 comprises a term
list display 6021 and a select button 6022. The term list
10532 stored in the thesaurus overview storage section 1053
is displayed on the term list display 6021. If the user next
selects a term cluster in the term list display 6021 using,
for example, the select button 6022 as an input unit, and
commands zoom with the zoom button 6033 (step 802), the
associated terms of the terms belonging to the term cluster
are then acquired from the association thesaurus 1051
(step 803). These terms are clustered (step 804) and the
generated term clusters are displayed on the association term
cluster display 604 (step 805). If the user commands the
termination of thesaurus browsing (step 806), then the
processing ends, and, if there is no command from the user,
then the process returns to step 802. During the zooming
command in step 802, if the user selects the term cluster 6041
displayed on the association term cluster display 604 by using
the select button 6042 and commands zooming with the zoom
button 6033, then words associated with that association term
cluster are displayed on the association term cluster display
604. If the user clicks on a term displayed on the thesaurus
overview display 602 or the association term cluster display
604 and then clicks the zoom button 6033, then words associated
with that term are displayed on the association term cluster
display 604. The user can command how many clusters to separate
the terms into or how many terms to extract into one cluster by
selecting (clicking) the Number of Clusters 6031 and the
Number of Terms in each Cluster 6032.
(Benefits of Thesaurus Browsing) 
A function to search for (retrieve) key words in a text
and a function to store text in a folder allow the user to
extract text associated with words the user entered as key
words and store it for creating FAQ. Also, a thesaurus
can be created from the overall text database (text or
transcript reply log), and a thesaurus browsing function is
provided allowing the user to navigate to a portion of the
thesaurus containing terms the user selected after checking
a thesaurus overview showing the overall thesaurus
structure, thus making it easy for the user to hit upon
(conceive) keywords. Checking the thesaurus overview makes
it easy for the user to acquire a grasp of topics within the
document collection. Viewing the array of representative
terms summarized into one term cluster allows perceiving the
topic and its contents. Setting terms associated with a term
on the cluster display (display summarizing terms with a
strong correlation as term clusters) assists in
conjecturing on the topics, sub-topics and their contents
linked to that term.

The system of the present invention contains a
thesaurus browsing function and a key word text retrieval
function allowing the user to extract text containing high
frequency information and store it in a classification
folder, and further contains another function to collect the
remaining text into a low frequency information folder.
FIG. 6 shows the layout of the display interface for text
classification (or text classification display). The text
classification display 601, as shown in FIG. 6, comprises a
thesaurus overview display 602 for thesaurus browsing, a
thesaurus zooming function 603, an associated term cluster
display 604, a text retrieval command section 605 for
keyword text retrieval, a text retrieval result display 606
and a text save section 607 for saving the text category.

The thesaurus overview display 602 comprises a term
list display 6021 and a Select button 6022. A term list 10532
stored in the thesaurus overview storage section 1053 is
displayed on the term list display 6021. The thesaurus
zooming function 603 is made up of a Number of clusters 6031,
a Number of terms in each cluster 6032 and a zoom button 6033.

The associated term cluster display 604 is made up of
a term list display section 6041 and a select button 6042.

The text retrieval command section 605 is made up of
a search term entry box 6051 and a search button 6052. The
text retrieval result display 606 is made up of a text
display 6061 and a text select button 6062. The text save
section 607 is made up of a folder name display 6071 and a
folder select button 6072.
(Text Classification Procedure)

The system of the present invention contains a
function to collect the remaining text information and store
it in a low frequency information folder after extracting
the text containing high frequency information and storing
it in a folder. FIG. 9 is a flow chart showing the text
classification procedure of the present system. The text
classification procedure of this system is next described
using the text classification screen of FIG. 6 and the flow
chart of FIG. 9. When a start classification command is
issued (step 901), the call center text database 104 is
accessed and a retrieval flag 1043 showing that retrieval is
complete and a classification flag 1044 showing that
classification is complete are reset to a value of "0". When
the user enters a term into the search term entry box 6051 and
clicks the search button 6052 to command a key word text search
(retrieval) (step 903), a text retrieval (search) for the
corresponding key word is made in the transcript of
conversation (reply log memo) 1042 of the call center text
database 104 (step 904), the retrieval flag 1043 of the call
center text database 104 is set to "1" to show that retrieval
is complete (step 905), and the text retrieval results are
displayed in the text display 6061 of the text retrieval result
display 606 (step 906). When the user wants to save a text from
the text retrieval result list and clicks the text select
button 6062 and the folder select button 6072 (step 907), the
selected text is saved in the text save folder 106 (step 908),
and the classification flag 1044 in the call center text
database 104 is set to "1" to show that classification is
complete (step 909).

If the user commands that classification end (step
910), text with a retrieved flag of "0" is stored in the low
frequency information folder (step 911).

The method for storing text into the low frequency
information folder may also function so that text with a
retrieved flag of "0" is stored in the low frequency
information folder. A select flag may also be prepared in
the text save folder so that text, other than text whose
classification is specified by the user as complete, is
saved in the low frequency information folder. Further,
instead of a retrieved flag and a classification complete
flag showing that retrieval and classification are complete,
the retrieval count and classification count may be updated,
and text with a value lower than a retrieval count and
classification count threshold may be stored in the low
frequency information folder.
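The flag handling of FIG. 9 can be sketched as follows. This is a minimal illustration under stated assumptions: key word retrieval is modeled as a simple substring match, and the class and function names (`Conversation`, `search`, `save_to_folder`, `collect_low_frequency`) and the sample records are hypothetical rather than taken from the specification.

```python
from dataclasses import dataclass


@dataclass
class Conversation:
    """One record of the call center text database (fields follow FIG. 2)."""
    conversation_id: int
    transcript: str
    retrieval_flag: int = 0    # 1 once the record was hit by a key word search
    classifying_flag: int = 0  # 1 once the record was saved to a folder


def search(database, keyword):
    """Steps 904-905: retrieve matching records and mark them as retrieved."""
    hits = [rec for rec in database if keyword in rec.transcript]
    for rec in hits:
        rec.retrieval_flag = 1
    return hits


def save_to_folder(folders, folder_name, record):
    """Steps 908-909: save the selected record and mark it as classified."""
    folders.setdefault(folder_name, []).append(record)
    record.classifying_flag = 1


def collect_low_frequency(database):
    """Step 911: records never retrieved go to the low frequency folder."""
    return [rec for rec in database if rec.retrieval_flag == 0]


# Hypothetical reply log.
database = [
    Conversation(1, "the printer shows a paper jam error"),
    Conversation(2, "how do I install the printer driver"),
    Conversation(3, "the operator was rude on the phone"),
]
folders = {}
hits = search(database, "printer")
save_to_folder(folders, "printer FAQ", hits[0])
low_frequency_folder = collect_low_frequency(database)
print([rec.conversation_id for rec in low_frequency_folder])
```

The count-based variant mentioned above would replace the two flags with counters and compare them against a threshold in `collect_low_frequency`.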

The system of the present invention contains a
thesaurus browsing function to assist in remembering key
words. The user can make a search of the text for a key word
by selecting a term displayed during the thesaurus browsing
process. Clicking on a term displayed in the term list
display 6021 of the thesaurus overview display 602 copies that
term into the search term entry box 6051. Clicking the
select button 6022 of the thesaurus overview display 602 copies
all terms displayed in the term list display 6021 into the
search term entry box 6051. In the same way, clicking on
a term displayed in the term list display section 6041 of the
association term cluster display 604 copies that term into the
search term entry box 6051, and clicking the select button
6042 copies all terms displayed in the term list display section
6041 into the search term entry box 6051. Terms appearing
within the overall transcript (reply log) are linked (given
associations) and stored. Thesaurus browsing therefore
allows collecting and classifying high frequency
information.

(Extracting Knowledge from Low Frequency Information) 

The system of the present invention can collect text
never retrieved in the period from the start to finish of
classifying, or text not classified into any folder, and
store it in a low frequency information folder. Here, terms
possessing negative meanings, such as "rude" and
"disappointment", or modality expressions such as
"won't you give", "originally", "why can't you", and "want"
serve as effective indicators when analyzing text for the
purpose of risk management. As a unit for extracting knowledge
valuable for risk management from low frequency information, a
function for extracting negative expressions and a function
for extracting modality expressions showing a customer or an
operator modality are provided. An overview of the
procedure for extracting text containing negative
expressions and modality expressions from the transcripts of
conversations (reply log memo) stored in low frequency
information folders is described next using the flow chart
in FIG. 21. First of all, candidate negative words and
candidate modality expressions are extracted from the
transcripts of conversations (reply log memo) stored in low
frequency information folders (step 2101). Selections made
by the user from these candidate negative words and
candidate modality expressions are next registered in the
negative word dictionary and modality expression dictionary
(step 2102). Finally, a key word search (or retrieval) is
made using the terms registered in the negative
word dictionary and modality expression dictionary as the
key words (step 2103), and the text containing negative
words and modality expressions is extracted and the
contents are checked (step 2104).
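The final key word search of steps 2103 and 2104 can be sketched as follows. This is an illustration only: it assumes a case-insensitive substring match, the function name `extract_flagged_text` is hypothetical, and the dictionary entries are the English glosses used above rather than the actual dictionary contents.

```python
def extract_flagged_text(low_frequency_texts, negative_words, modality_expressions):
    """Return (text, matched key words) for texts containing any registered term.

    Assumption: a key word matches when it occurs as a substring of the
    lower-cased text.
    """
    keywords = set(negative_words) | set(modality_expressions)
    flagged = []
    for text in low_frequency_texts:
        lowered = text.lower()
        hits = sorted(k for k in keywords if k in lowered)
        if hits:
            flagged.append((text, hits))
    return flagged


# Hypothetical contents of the low frequency information folder.
low_frequency_texts = [
    "The operator was rude to me",
    "Why can't you send a replacement?",
    "Thank you for the quick reply",
]
flagged = extract_flagged_text(
    low_frequency_texts,
    negative_words={"rude", "disappointment"},
    modality_expressions={"why can't you", "won't you give"},
)
for text, hits in flagged:
    print(hits, "->", text)
```

The flagged texts are what the user would then review in step 2104.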
The procedure for extracting negative expressions and
modality expressions is described next.
(Extracting Negative Expressions) 

The present system contains a unit for extracting
negative expressions from the transcripts of conversations
(reply log memo). This unit comprises a negative word
candidate extraction function for extracting negative word
candidates from the transcripts of conversations (reply log
memo), and a negative word dictionary creation function for
registering words among the candidate negative words
decided by the user to be negative words. To implement these
functions, the present system comprises a negative
character dictionary 1071 registered with characters that
tend (high probability) to comprise elements of negative
words, such as "lose", "negative", and "slow"; a
negative word dictionary 1072 registered with words already
determined to be negative words; and a stop word dictionary
(for extracting negative words) 1073 registered with words
already determined not to be negative words.



FIG. 12 shows the data structure of the negative
character dictionary 1071. As shown in FIG. 12, each record
of the negative character dictionary contains a record ID
10711, a Negative character 10712, a Negative level 10713,
a Number of words registered in negative word dictionary
10714, and a Number of words registered in stop word
dictionary (for extracting negative words) 10715. The
Number of words registered in negative word dictionary 10714
holds the number of words containing the target negative
character among words registered in the negative word
dictionary 1072, the Number of words registered in stop word
dictionary 10715 holds the number of words containing
the target negative character from among words registered in
the stop word dictionary 1073 (for extracting negative words),
and the negative level 10713 holds a value of 0 or 1 showing
the percentage of words registered in the negative word
dictionary from among words extracted as candidate negative
words. The value of this negative level may also be set as
desired by the user. FIG. 13 shows the data structure of
a negative word dictionary 1072. Each record of the negative
word dictionary holds a record ID 10721, a Negative word
10722, and a Negative level 10723. The Negative level 10723
holds the values for the negative level 10713 recorded in the
negative character dictionary. FIG. 14 shows the data
structure of the (negative) stop word dictionary (for
extracting negative words) 1073. Each record in the
(negative) stop word dictionary holds a record ID 10731 and
a Stop word for extracting negative words 10732.

The procedure for extracting candidate negative words
is described next while referring to the flow chart in FIG. 17.
First, all words appearing in the transcript of conversation
(memo) 1042 are extracted and a word list is created (step
1701). One word is loaded from the word list (step 1703),
a search is made of the negative character dictionary 1071,
and whether or not the word contains negative characters is
decided (step 1704). If the word contains negative
characters, then a search is made of the negative word
dictionary 1072, and a check (decision) is made on whether the
word is already registered in the negative word dictionary 1072
(step 1705). If already registered in the negative word
dictionary 1072, then it is already known to be a negative
word, so the word is not extracted as a candidate negative
word, and processing related to this word is terminated. If
the word is not registered in the negative word dictionary
1072, then a search is made of the (negative) stop word
dictionary 1073, and whether or not the word is already
registered in the (negative) stop word dictionary 1073 is
decided (step 1706). If registered in the (negative) stop
word dictionary 1073, then it is already known not to be a
negative word, so the word is not extracted as a candidate
negative word and processing related to this word is
terminated. The word is then registered in the candidate
negative word list (step 1707) if found to be not registered
in the negative word dictionary and not registered in the
(negative) stop word dictionary. By performing this same
processing on all words registered in the word list, those
words that contain negative characters and are registered in
neither the negative word dictionary nor the (negative) stop
word dictionary can be registered in the candidate negative
word list.
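The filtering of FIG. 17 can be sketched as follows. This is a minimal sketch: word segmentation (step 1701) is assumed to have been done already, the function name `extract_candidate_negative_words` is hypothetical, and the Japanese sample characters and words are illustrative assumptions, not the actual dictionary contents of the embodiment.

```python
def extract_candidate_negative_words(words, negative_chars, negative_words, stop_words):
    """Return the words that still need a user decision.

    A word becomes a candidate when it contains a negative character and is
    registered in neither the negative word dictionary nor the stop word
    dictionary.
    """
    candidates = []
    for word in dict.fromkeys(words):  # each distinct word once, order kept
        if not any(ch in word for ch in negative_chars):
            continue  # step 1704: no negative character
        if word in negative_words:
            continue  # step 1705: already known to be a negative word
        if word in stop_words:
            continue  # step 1706: already known not to be a negative word
        candidates.append(word)  # step 1707
    return candidates


# Illustrative data (assumed for this sketch).
word_list = ["失礼", "失望", "不満", "不動産", "電話", "失望"]
candidates = extract_candidate_negative_words(
    word_list,
    negative_chars={"失", "不"},
    negative_words={"失礼"},   # already registered as a negative word
    stop_words={"不動産"},     # contains a negative character but is not negative
)
print(candidates)
```

Because the two dictionaries are consulted before a word is proposed, each word is shown to the user at most once across repeated runs.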

The procedure for creating the negative word
dictionary is described next with reference to the flow
chart of FIG. 18. First of all, to decide if the candidate
negative word is a negative word or not, the candidate
negative word list is displayed on the screen (step 1801).
A typical negative word check screen is shown in FIG. 11.
The negative word check screen contains a Candidate negative
word display 11011, a Words registered in negative word
dictionary display 11012, a Words registered in stop word
dictionary (for extracting negative words) display 11013,
and a Register button 11014. The Words registered in
negative word dictionary display 11012 and Words registered
in stop word dictionary (for extracting negative words)
display 11013 are displayed as reference information for
making a decision, but may be omitted. The user decides
whether or not the candidate negative word displayed in the
Candidate negative word display 11011 is a negative word and
enters a check mark on that word if determined to be a
negative word (step 1802). When the user clicks the Register
button 11014 (step 1803), the word determined to be a
negative word is registered in the negative word dictionary
(step 1804). When determined not to be a negative word, that
word is registered in the stop word dictionary (step 1805).
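
The registration performed when the Register button is clicked (steps
1803 to 1805) might be sketched as follows. The function name, the
representation of the user's check marks as a set, and the sample
words are assumptions made for illustration.

```python
def register_decisions(candidates, checked, negative_words, stop_words):
    """Store each candidate in exactly one dictionary, depending on
    whether the user placed a check mark on it."""
    for word in candidates:
        if word in checked:
            negative_words.add(word)    # step 1804: judged a negative word
        else:
            stop_words.add(word)        # step 1805: judged not a negative word


# Hypothetical sample data, for illustration only.
negative_word_dictionary = {"不良"}
stop_word_dictionary = {"無料"}
register_decisions(
    candidates=["不安", "非常"],
    checked={"不安"},
    negative_words=negative_word_dictionary,
    stop_words=stop_word_dictionary,
)
```

Because every candidate ends up in one of the two dictionaries, the
same word is not presented again as a candidate on a later run.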
(Extracting Modality Expressions) 

The function for extracting modality expressions
showing the customer and operator modality is described
next. FIG. 15 shows the data structure of the modality
expression dictionary 1074. Each record in the modality
expression dictionary contains a Record ID 10741, a Modality
expression 10742, a Part of speech 10743, and a Modality
10744. FIG. 16 shows the data structure of the modality
expression stop word dictionary 1075. Each record in the
modality expression stop word dictionary contains a Record
ID 10751, a Modality expression stop word 10752 and a Part
of Speech 10753.
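
The two record layouts of FIG. 15 and FIG. 16 might be represented as
follows. The class names, field names, field types and the sample
record are assumptions made for illustration; the specification names
only the fields themselves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModalityExpressionRecord:
    """One record of the modality expression dictionary 1074."""
    record_id: int          # Record ID 10741
    expression: str         # Modality expression 10742
    part_of_speech: str     # Part of speech 10743
    modality: str           # Modality 10744


@dataclass(frozen=True)
class ModalityStopWordRecord:
    """One record of the modality expression stop word dictionary 1075."""
    record_id: int          # Record ID 10751
    stop_word: str          # Modality expression stop word 10752
    part_of_speech: str     # Part of Speech 10753


# Hypothetical sample record, for illustration only.
sample_record = ModalityExpressionRecord(
    record_id=1,
    expression="must",
    part_of_speech="helping verb",
    modality="obligation",
)
```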

The procedure for extracting the candidate modality
expression is described next with reference to the flow
chart in FIG. 19. First, all words appearing in the
transcript of conversation (memo) 1042 are extracted and a
word list is created (step 1901). One word is loaded from the
word list (step 1903), and if the part of speech is a helping
verb (step 1904), then the process proceeds to extracting
the candidate modality expression. In other words, a search
is made of the modality expression dictionary 1074, and
whether or not the word is registered in the modality expression
dictionary 1074 is decided (step 1905). If registered in
the modality expression dictionary 1074, then it is already
known to be a modality expression, so the word is not
extracted as a candidate modality expression, and processing
related to that word ends. If not registered in the modality
expression dictionary 1074, then a search is made of the
modality expression stop word dictionary 1075, and whether
or not the word is registered in the modality expression stop
word dictionary 1075 is decided (step 1906). If registered
in the modality expression stop word dictionary 1075, then
it is already known not to be a modality expression, so the
word is not extracted as a candidate modality expression and
processing related to that word ends. Words registered
neither in the modality expression dictionary nor in the
modality expression stop word dictionary
are then registered in the candidate modality expression
list (step 1907). By performing the same processing on all
words registered in the word list, those words whose part
of speech is an adverb or helping verb and that are not
registered in the modality expression dictionary or the
modality expression stop word dictionary are then
registered in the candidate modality expression list.
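
The flow of FIG. 19 might be sketched as follows. This is an
illustration under stated assumptions: the function name, the
representation of each word as a (word, part of speech) pair, the use
of sets for dictionaries 1074 and 1075, and the sample data are all
hypothetical. Because step 1904 tests for a helping verb while the
closing sentence above also mentions adverbs, the accepted parts of
speech are left as a parameter.

```python
def extract_candidate_modality_expressions(
        tagged_words, modality_expressions, modality_stop_words,
        target_pos=frozenset({"helping verb"})):
    """Return words of a target part of speech that are registered in
    neither the modality expression dictionary nor its stop word dictionary."""
    candidates = []
    # Step 1901: build the word list (dropping duplicates is an assumption).
    for word, pos in dict.fromkeys(tagged_words):   # step 1903
        if pos not in target_pos:                   # step 1904
            continue
        if word in modality_expressions:            # step 1905
            continue
        if word in modality_stop_words:             # step 1906
            continue
        candidates.append(word)                     # step 1907
    return candidates


# Hypothetical sample data, for illustration only.
tagged = [
    ("must", "helping verb"),
    ("should", "helping verb"),
    ("printer", "noun"),
    ("will", "helping verb"),
    ("probably", "adverb"),
]
helping_verbs_only = extract_candidate_modality_expressions(
    tagged, modality_expressions={"must"}, modality_stop_words={"will"})
with_adverbs = extract_candidate_modality_expressions(
    tagged, modality_expressions={"must"}, modality_stop_words={"will"},
    target_pos=frozenset({"helping verb", "adverb"}))
```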

The procedure for creating the modality expression
dictionary is described next with reference to the flow
chart in FIG. 20. The candidate modality expression list
is first of all displayed (step 2001) to determine whether
or not the candidate modality expression is a modality
expression. A modality expression check screen is used that
is the same as the negative word check screen of FIG. 11.
The user decides if the candidate modality expression
displayed on the screen is a modality expression or not and
places a check mark on the word determined to be a modality
expression (step 2002). When the user clicks the Register
button (step 2003), the word determined to be a modality
expression is registered in the modality expression
dictionary (step 2004). Words determined not to be modality
expressions are registered in the modality expression stop
word dictionary (step 2005).



